Reward Augmented Maximum Likelihood for Neural Structured Prediction
Authors
Abstract
A key problem in structured output prediction is direct optimization of the task reward function that matters for test evaluation. This paper presents a simple and computationally efficient approach to incorporate task reward into a maximum likelihood framework. We establish a connection between the log-likelihood and regularized expected reward objectives, showing that at a zero temperature, they are approximately equivalent in the vicinity of the optimal solution. We show that optimal regularized expected reward is achieved when the conditional distribution of the outputs given the inputs is proportional to their exponentiated (temperature adjusted) rewards. Based on this observation, we optimize conditional log-probability of edited outputs that are sampled proportionally to their scaled exponentiated reward. We apply this framework to optimize edit distance in the output label space. Experiments on speech recognition and machine translation for neural sequence to sequence models show notable improvements over a maximum likelihood baseline by using edit distance augmented maximum likelihood.
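For concreteness, the sampling-then-maximum-likelihood recipe described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it restricts edits to substitutions (so the number of outputs at each edit distance has a closed form), and sample_edit_count, raml_sample, and the model.log_prob call are hypothetical names.

import math
import random

def sample_edit_count(seq_len, vocab_size, tau, max_edits):
    # RAML samples an edit distance e with probability proportional to
    # c(e) * exp(-e / tau), where c(e) counts outputs at distance e.
    # For substitution-only edits: c(e) = C(n, e) * (V - 1)^e.
    weights = [math.comb(seq_len, e) * (vocab_size - 1) ** e * math.exp(-e / tau)
               for e in range(max_edits + 1)]
    return random.choices(range(max_edits + 1), weights=weights, k=1)[0]

def raml_sample(y_star, vocab, tau=0.9):
    # Draw an edited output with probability roughly proportional to
    # exp(reward / tau), where reward = -edit_distance to y_star.
    # Assumes len(vocab) >= 2 so a differing substitute always exists.
    n = len(y_star)
    e = sample_edit_count(n, len(vocab), tau, max_edits=n)
    y = list(y_star)
    for pos in random.sample(range(n), e):
        y[pos] = random.choice([t for t in vocab if t != y[pos]])
    return y

def raml_loss(model, x, y_star, vocab, tau=0.9):
    # One RAML step is ordinary maximum likelihood, but on the sampled
    # target instead of the ground truth; model.log_prob is an assumed
    # seq2seq API returning the conditional log-likelihood.
    y_tilde = raml_sample(y_star, vocab, tau)
    return -model.log_prob(y_tilde, x)

As tau approaches zero the sampled targets concentrate on the ground truth, and the procedure reduces to plain maximum likelihood training.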
Similar Resources
Softmax Q-Distribution Estimation for Structured Prediction: A Theoretical Interpretation for RAML
Reward augmented maximum likelihood (RAML), a simple and effective learning framework to directly optimize towards the reward function in structured prediction tasks, has led to a number of impressive empirical successes. RAML incorporates task-specific reward by performing maximum-likelihood updates on candidate outputs sampled according to an exponentiated payoff distribution, which gives hig...
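In symbols, the exponentiated payoff distribution mentioned above, and the RAML objective built on it, can be written as follows (tau is the temperature, r the task reward; notation mirrors the RAML paper):

q(\mathbf{y} \mid \mathbf{y}^{*}; \tau) = \frac{\exp\{ r(\mathbf{y}, \mathbf{y}^{*}) / \tau \}}{\sum_{\mathbf{y}'} \exp\{ r(\mathbf{y}', \mathbf{y}^{*}) / \tau \}}

\mathcal{L}_{\mathrm{RAML}}(\theta) = \sum_{(\mathbf{x}, \mathbf{y}^{*})} \; \mathbb{E}_{\mathbf{y} \sim q(\cdot \mid \mathbf{y}^{*}; \tau)} \big[ -\log p_{\theta}(\mathbf{y} \mid \mathbf{x}) \big]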
Neural Symbolic Machines: Learning Semantic Parsers on Freebase with Weak Supervision
Harnessing the statistical power of neural networks to perform language understanding and symbolic reasoning is difficult when it requires executing efficient discrete operations against a large knowledge base. In this work, we introduce a Neural Symbolic Machine (NSM), which contains (a) a neural “programmer”, i.e., a sequence-to-sequence model that maps language utterances to programs and ut...
Maximum Margin Reward Networks for Learning from Explicit and Implicit Supervision
Neural networks have achieved state-of-the-art performance on several structured-output prediction tasks, trained in a fully supervised fashion. However, annotated examples in structured domains are often costly to obtain, which thus limits the applications of neural networks. In this work, we propose Maximum Margin Reward Networks, a neural network-based framework that aims to learn from both exp...
The More the Merrier: Parameter Learning for Graphical Models with Multiple MAPs
Conditional random fields (CRFs) are a popular and effective approach to structured prediction. When the underlying structure does not have a small tree-width, maximum likelihood estimation (MLE) is in general computationally hard. Discriminative methods such as Perceptron or Max-Margin Markov Networks circumvent this problem by requiring the MAP assignment only, which is often more tractable, ei...
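The tractability gap described above is visible directly in the updates. Assuming a log-linear model with feature map phi (a standard setup, not specific to this paper), the CRF log-likelihood gradient requires expectations under the model, i.e. marginal inference:

\nabla_{\theta} \log p_{\theta}(y \mid x) = \phi(x, y) - \mathbb{E}_{y' \sim p_{\theta}(\cdot \mid x)} \big[ \phi(x, y') \big]

whereas a structured perceptron update needs only the MAP assignment:

\hat{y} = \arg\max_{y'} \theta^{\top} \phi(x, y'), \qquad \theta \leftarrow \theta + \phi(x, y) - \phi(x, \hat{y})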
SEARNN: Training RNNs with Global-Local Losses
We propose SEARNN, a novel training algorithm for recurrent neural networks (RNNs) inspired by the “learning to search” (L2S) approach to structured prediction. RNNs have been widely successful in structured prediction applications such as machine translation or parsing, and are commonly trained using maximum likelihood estimation (MLE). Unfortunately, this training loss is not always an approp...